Skip to content

Conversation

@kanoshiou
Copy link
Contributor

Optimize date grouping with formatting in ESQL

This PR optimizes the performance of date grouping operations in ESQL by automatically converting DATE_FORMAT in GROUP BY clauses to more efficient DATE_TRUNC operations. The optimization:

  • Automatically detects and converts DATE_FORMAT patterns to equivalent DATE_TRUNC intervals
  • Moves date formatting from the grouping phase to a subsequent EVAL phase
  • Handles timezone and DST transitions correctly
  • Supports various time intervals from nanoseconds to years

Example optimization:

FROM test
| STATS avg = AVG(salary) BY date = DATE_FORMAT("yyyy-MM", hire_date)

becomes:

FROM test
| STATS avg = AVG(salary) BY date1 = DATE_TRUNC(1 month, hire_date) 
| EVAL date = DATE_FORMAT("yyyy-MM", date1) 
| KEEP avg, date

Optimized plan:

Project[[avg{r}#7, date{r}#4]]
\_Eval[[$$SUM$avg$0{r$}#21 / $$COUNT$avg$1{r$}#22 AS avg#7, DATEFORMAT([79 79 79 79 2d 4d 4d][KEYWORD],$$DATE_FORMAT(
"yy>$date$0{r$}#20) AS date#4]]
  \_Limit[1000[INTEGER],false]
    \_Aggregate[[$$DATE_FORMAT("yy>$date$0{r$}#20],[SUM(salary{f}#14,true[BOOLEAN]) AS $$SUM$avg$0#21, COUNT(salary{f}#14,true[
BOOLEAN]) AS $$COUNT$avg$1#22, $$DATE_FORMAT("yy>$date$0{r$}#20]]
      \_Eval[[DATETRUNC(P1M[DATE_PERIOD],hire_date{f}#16) AS $$DATE_FORMAT("yy>$date$0#20]]
        \_EsRelation[test][_meta_field{f}#15, emp_no{f}#9, first_name{f}#10, g..]

Closes #114772

@elasticsearchmachine elasticsearchmachine added v9.1.0 needs:triage Requires assignment of a team area label external-contributor Pull request authored by a developer outside the Elasticsearch team labels Jun 11, 2025
@ivancea ivancea added Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) :Analytics/ES|QL AKA ESQL and removed needs:triage Requires assignment of a team area label labels Jun 19, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

…tting

# Conflicts:
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/rules/logical/ReplaceAggregateNestedExpressionWithEval.java
#	x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizerTests.java
@kanoshiou
Copy link
Contributor Author

Thanks for your review, @fang-xing-esql! I’ve updated the branch based on your comments.

I revised the logic for inferring the minimal time interval corresponding to a given date format. Note that date_nanos cannot be optimized, as nanosecond intervals are not supported by DATE_TRUNC. I also updated LogicalPlanOptimizerTests so that this optimization supports inline stats, and added additional tests covering the entire change.

I’d appreciate it if you could take another look when you have time.

@fang-xing-esql
Copy link
Member

buildkite test this

Copy link
Member

@fang-xing-esql fang-xing-esql left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for adding more tests and handle inlinestats @kanoshiou! I added some comment around the casting of the format to literal.

I'll need to look into the change related to inline stats more, as it seems a bit more complicated than I had expected. This change created an eval on top of aggregate, which makes inline stats a bit tricky.

And I'm wondering if you could elaborate a bit more about why this transformation does not apply to date_nanos? I tried some queries on date_nanos, and they seem work fine, the changes to the rule doesn't block date_nanos. Please find my experiments below(the 2nd and 4th query show equivalent results). Perhaps there is something that I missed here.

+ curl -u elastic:password -X POST 'localhost:9200/_query?format=txt&pretty' -H 'Content-Type: application/json' '-d
{
  "query": "from test1 | eval x= date_format(\"y-MM-dd\", nanos)"
}
'
         millis         |                                            nanos                                            |        num        |       x       
------------------------+---------------------------------------------------------------------------------------------+-------------------+---------------
2023-10-23T13:55:01.543Z|2023-10-23T13:55:01.543123456Z                                                               |1698069301543123456|2023-10-23     
2023-10-23T13:55:01.543Z|2023-10-23T12:55:01.543123456Z                                                               |1698069301543123456|2023-10-23     
1999-10-23T12:15:03.360Z|[2023-01-23T13:55:01.543123456Z, 2023-02-23T13:33:34.937193Z, 2023-03-23T12:15:03.360103847Z]|0                  |null           
+ curl -u elastic:password -X POST 'localhost:9200/_query?format=txt&pretty' -H 'Content-Type: application/json' '-d
{
  "query": "from test1 | stats count(*) by date_format(\"y-MM-dd\", nanos)"
}
'
   count(*)    |date_format("y-MM-dd", nanos)
---------------+-----------------------------
1              |null                         
2              |2023-10-23                                     
+ curl -u elastic:password -X POST 'localhost:9200/_query?format=txt&pretty' -H 'Content-Type: application/json' '-d
{
  "query": "from test1 | eval x= date_trunc(1 day, nanos), y= date_format(\"y-MM-dd\", x)"
}
'
         millis         |                                            nanos                                            |        num        |           x            |       y       
------------------------+---------------------------------------------------------------------------------------------+-------------------+------------------------+---------------
2023-10-23T13:55:01.543Z|2023-10-23T13:55:01.543123456Z                                                               |1698069301543123456|2023-10-23T00:00:00.000Z|2023-10-23     
2023-10-23T13:55:01.543Z|2023-10-23T12:55:01.543123456Z                                                               |1698069301543123456|2023-10-23T00:00:00.000Z|2023-10-23     
1999-10-23T12:15:03.360Z|[2023-01-23T13:55:01.543123456Z, 2023-02-23T13:33:34.937193Z, 2023-03-23T12:15:03.360103847Z]|0                  |null                    |null           
+ curl -u elastic:password -X POST 'localhost:9200/_query?format=txt&pretty' -H 'Content-Type: application/json' '-d
{
  "query": "from test1 | stats count(*) by x = date_trunc(1 day, nanos) | eval y= date_format(\"y-MM-dd\", x)"
}
'
   count(*)    |           x            |       y       
---------------+------------------------+---------------
1              |null                    |null           
2              |2023-10-23T00:00:00.000Z|2023-10-23  

evals.add(as);
if (asChild instanceof DateFormat df) {
// Extract the format pattern and field from DateFormat
Literal format = (Literal) df.format();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some cases when the format is not a constant value, this cast might not be safe here, for example this query below will fail. The concat function might need to be evaluated before this transformation from date_format to date_trunc.

+ curl -u elastic:password -X POST 'localhost:9200/_query?format=txt&pretty' -H 'Content-Type: application/json' '-d
{
  "query": "from sample_data | eval format = concat(\"yyyy\", \"mm\") | stats count(*) by date_format(format, @timestamp)"
}
'
{
  "error" : {
    "root_cause" : [
      {
        "type" : "class_cast_exception",
        "reason" : "class org.elasticsearch.xpack.esql.core.expression.ReferenceAttribute cannot be cast to class org.elasticsearch.xpack.esql.core.expression.Literal (org.elasticsearch.xpack.esql.core.expression.ReferenceAttribute and org.elasticsearch.xpack.esql.core.expression.Literal are in unnamed module of loader java.net.URLClassLoader @de81be1)"
      }
    ],
    "type" : "class_cast_exception",
    "reason" : "class org.elasticsearch.xpack.esql.core.expression.ReferenceAttribute cannot be cast to class org.elasticsearch.xpack.esql.core.expression.Literal (org.elasticsearch.xpack.esql.core.expression.ReferenceAttribute and org.elasticsearch.xpack.esql.core.expression.Literal are in unnamed module of loader java.net.URLClassLoader @de81be1)"
  },
  "status" : 500
}

…tting

# Conflicts:
#	x-pack/plugin/esql/qa/testFixtures/src/main/resources/inlinestats.csv-spec
#	x-pack/plugin/esql/qa/testFixtures/src/main/resources/stats.csv-spec
#	x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/rules/logical/PropagateInlineEvals.java
@kanoshiou
Copy link
Contributor Author

Thanks for your review @fang-xing-esql!

I thought you were referring to formatting nanosecond intervals using date_format. This transformation also applies to the date_nanos type, and I’ve updated the branch and added additional tests to cover both the date_nanos case and scenarios where the format is a ReferenceAttribute.

Copy link
Member

@fang-xing-esql fang-xing-esql left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for addressing the previous round of comments and for being patient, @kanoshiou! I went through the code changes again, and for most queries, the PR does a great job transforming date_format to date_trunc correctly — really appreciate the effort.

I’ve added a few comments in the code suggesting minor updates to the tests and additional explanations in areas that are a bit tricky to follow. Since aggregation can get complex and tricky, having broader test coverage will help ensure that future changes don’t unintentionally alter expected behavior.

I have tried some normal queries, they work as expected. And I'm trying to push it harder with a bit more tricky queries, and came across these situations. In ES|QL , we allow the flexibility that grouping can be referenced in the aggregate explicitly, and sometime this flexibility adds extra complexity. These two queries below don't error out in main.

+ curl -u elastic:password -X POST 'localhost:9200/_query?format=txt&pretty' -H 'Content-Type: application/json' '-d
{
  "query": "from test1 | stats count(*), concat(x, \"01\") by x = date_format(\"y-MM-dd\", nanos)"
}
'
{
  "error" : {
    "root_cause" : [
      {
        "type" : "null_pointer_exception",
        "reason" : "Cannot invoke \"org.elasticsearch.xpack.esql.planner.Layout$ChannelAndType.channel()\" because the return value of \"org.elasticsearch.xpack.esql.planner.Layout.get(org.elasticsearch.xpack.esql.core.expression.NameId)\" is null"
      }
    ],
    "type" : "null_pointer_exception",
    "reason" : "Cannot invoke \"org.elasticsearch.xpack.esql.planner.Layout$ChannelAndType.channel()\" because the return value of \"org.elasticsearch.xpack.esql.planner.Layout.get(org.elasticsearch.xpack.esql.core.expression.NameId)\" is null"
  },
  "status" : 500
}
+ curl -u elastic:password -X POST 'localhost:9200/_query?format=txt&pretty' -H 'Content-Type: application/json' '-d
{
  "query": "from test1 | inline stats count(*), concat(x, \"01\") by x = date_format(\"y-MM-dd\", nanos)"
}
'
{
  "error" : {
    "root_cause" : [
      {
        "type" : "illegal_state_exception",
        "reason" : "Found 1 problem\nline 1:14: Plan [Eval[[CONCAT(x{r}#106,01[KEYWORD]) AS concat(x, \"01\")#103]]] optimized incorrectly due to missing references [x{r}#106]"
      }
    ],
    "type" : "illegal_state_exception",
    "reason" : "Found 1 problem\nline 1:14: Plan [Eval[[CONCAT(x{r}#106,01[KEYWORD]) AS concat(x, \"01\")#103]]] optimized incorrectly due to missing references [x{r}#106]"
  },
  "status" : 500
}

…tting

# Conflicts:
#	x-pack/plugin/esql/qa/testFixtures/src/main/resources/inlinestats.csv-spec
#	x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizerTests.java
@kanoshiou
Copy link
Contributor Author

I sincerely appreciate the time and effort you put into reviewing this, @fang-xing-esql! Thanks for your review! Your edge cases helped me deepen my understanding of ES|QL. I've updated the branch based on your comments - thanks again for the insightful feedback, and I’d appreciate it if you could take another look when you have time.

Regarding the case where the grouping is referenced within the aggregate, I’ve given it some thought. In such a scenario, it seems that the optimization cannot be applied, since the aggregate operation requires the formatted field to be present upfront. What do you think? I’d appreciate any feedback or thoughts you might have on this.

@fang-xing-esql
Copy link
Member

buildkite test this

# Conflicts:
#	x-pack/plugin/esql/qa/testFixtures/src/main/resources/stats.csv-spec
@fang-xing-esql
Copy link
Member

buildkite test this

@fang-xing-esql
Copy link
Member

buildkite test this

@astefan astefan self-requested a review October 21, 2025 05:55
@fang-xing-esql
Copy link
Member

fang-xing-esql commented Oct 30, 2025

@kanoshiou First of all, we truly appreciate the effort you put into this PR as always. From a functional standpoint, I couldn't find any more issues or bugs at this moment. However, this change is not a simple bug fix, it’s more of a feature that could have potential performance implications. After internal discussion within the team carefully, we would recommend pausing and not merging this PR for now, considering the following aspects:

  • The changes introduced in this PR are non-trivial, two key aggregation related logical planner rules are modified. Rewriting date_format to date_trunc may have a measurable performance impact. Features that could affect query performance typically need to go through performance regression testing to detect any potential degradations before merging. If no existing benchmarks cover the relevant query patterns, new benchmarks are required. We came across cases in the past where performance impacting PRs were merged without such testing, leading to noticeable regressions that took considerable time to resolve. To be cautious, we’re not ready to merge this PR yet from a performance perspective.

  • We are also actively working on timezone support in ES|QL, which is also complex. It’s currently unclear how this rewrite would interact with timezone. Given that timezone support is a higher priority at the moment, we’d prefer to focus on completing that work first and then revisit this feature afterward.

Overall, the logic changes in these two rules make sense, we like it, and we’d like to revisit this PR in the near future once we have a clearer understanding of its interaction with timezone support and potential performance implications. Thank you very much for your understanding, and we hope it make sense to you as well, don't hesitate to reach out to us for any questions.

…tting

# Conflicts:
#	x-pack/plugin/esql/qa/testFixtures/src/main/resources/inlinestats.csv-spec
@kanoshiou
Copy link
Contributor Author

Thanks for the detailed feedback and I completely understand your concerns, @fang-xing-esql.

I'm considering adding some benchmarks for this PR. However, after reviewing the existing code under the benchmark package, I couldn’t find any examples that directly apply to full ES|QL query execution. From what I can tell, it might be necessary to introduce a new benchmark specifically designed to run complete ES|QL queries end-to-end. Does that sound reasonable to you?

As for timezone handling, I believe this PR should be fine as long as both date_trunc and date_format support timezone correctly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL >enhancement external-contributor Pull request authored by a developer outside the Elasticsearch team Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.3.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ESQL: optimize date grouping with formatting

4 participants